Abstract

The  Behavior-Based  Access  Control  (BBAC)  project  seeks 
to  address  the  increasingly  sophisticated  attacks  and 
attempts  to  exfiltrate  or  corrupt  critical  sensitive 
information.  BBAC  uses  statistical  machine  learning 
techniques  (clustering  and  classification)  to  make 
predictions  about  the  intent  of  actors  establishing  TCP 
connections  and  HTTP  requests.  Administrators  will  need 
to  assign  new  computers  to  appropriate  clusters,  to  be 
alerted  about  changes  in  cluster  assignments,  to  select 
classifiers  and  settings  to  use,  and  to  monitor  accuracy  of 
the  system.  We  discuss  the  requirements  and  our  current 
approach  in  this  Interactive  ML  application  domain. 

ACM Classification Keywords

H.5.2 [Information interfaces and presentation (e.g., HCI)]: User-centered design.

Introduction 

Current  cyber  security  monitoring  systems  have  several 
shortcomings:  they  1)  have  narrow  focus  and  are 
signature-based,  2)  use  static  policies,  and  3)  don't  use 
audit  data  for  analysis  until  it  is  too  late.  This  leaves 
systems  vulnerable  to  sophisticated  attacks  including 
0-day and insider attacks. Behavior-Based Access Control
(BBAC)  [2]  seeks  to  address  these  issues  by  performing 
analysis  at  multiple  layers,  including  the  network  layer, 
application  layer,  and  document  layer.  BBAC  uses 
clustering  to  form  groups  of  computers  that  have  similar 
behavior.  Classifiers  are  then  trained  for  each  cluster. 
Currently  we  are  using  both  HTTP  and  TCP  logs  in  our 
analyses.  The  techniques  we  use  for  HTTP  data 
processing  are  similar  to  those  used  by  Ma  et  al.  [1]  to 
detect malicious URLs. Our architecture is based on a
cloud  framework  that  will  allow  the  clustering  and 
classifiers  to  be  trained  at  least  once  a  day  and  will  allow 
rapid  classification  of  computer  behavior. 
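
The cluster-then-classify idea can be illustrated with a minimal sketch. The source does not name the clustering or classification algorithms, so plain k-means and a nearest-mean classifier are used here purely as stand-ins; the host names, feature meanings, and values are hypothetical.

```python
from collections import defaultdict
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Lloyd's k-means (a stand-in); the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            groups[nearest(p, centroids)].append(p)
        centroids = [
            tuple(sum(col) / len(groups[i]) for col in zip(*groups[i]))
            if groups[i] else centroids[i]
            for i in range(k)
        ]
    return centroids


def train_per_cluster(samples, host_cluster):
    """Build one nearest-mean model per cluster: {cluster: {label: mean vector}}."""
    rows = defaultdict(list)
    for host, features, label in samples:
        rows[(host_cluster[host], label)].append(features)
    models = defaultdict(dict)
    for (cluster, label), vectors in rows.items():
        models[cluster][label] = tuple(sum(c) / len(vectors) for c in zip(*vectors))
    return models


def classify(models, cluster, features):
    """Label whose mean vector (within this cluster's model) is closest."""
    means = models[cluster]
    return min(means, key=lambda label: dist(features, means[label]))


# Hypothetical host profiles: (inbound, outbound) connections per hour.
hosts = {"web1": (100, 5), "desk1": (5, 50), "web2": (110, 6), "desk2": (6, 55)}
centroids = kmeans(list(hosts.values()), k=2)
host_cluster = {name: nearest(p, centroids) for name, p in hosts.items()}

# Hypothetical labeled requests: (host, (upload KB, distinct domains), label).
samples = [
    ("web1", (500, 3), "benign"), ("web2", (520, 4), "benign"),
    ("web1", (50, 40), "malicious"),
    ("desk1", (20, 10), "benign"), ("desk2", (25, 12), "benign"),
    ("desk1", (400, 2), "malicious"),
]
models = train_per_cluster(samples, host_cluster)
```

In this toy data, the same request features can be labeled differently depending on the cluster of the originating host, which is the motivation for training a classifier per cluster.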

User  Interaction 

The users of BBAC will be system administrators, who will
interact with both the training side of the system and
its real-time monitoring side.

During  the  training  phase,  the  system  will  re-analyze  the 
clusters and build new classifiers. Changes in clustering,
as well as changes in classifier performance, might trigger
user notifications. Our system will compare its
performance  using  newly  trained  classifiers  against  the 
previous  baseline  and  note  changes  in  behavior. 
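
One way such a baseline comparison might work is sketched below. It assumes both classifiers are scored on the same labeled hold-out set; the function names and the 2-point tolerance are illustrative choices, not part of BBAC.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    true_positive_rate: float
    false_positive_rate: float


def evaluate(predictions, labels):
    """Compute TPR and FPR from parallel lists of booleans (True = malicious)."""
    tp = sum(p and y for p, y in zip(predictions, labels))
    fp = sum(p and not y for p, y in zip(predictions, labels))
    positives = sum(labels)
    negatives = len(labels) - positives
    return Metrics(
        true_positive_rate=tp / positives if positives else 0.0,
        false_positive_rate=fp / negatives if negatives else 0.0,
    )


def compare_to_baseline(baseline, candidate, tolerance=0.02):
    """Describe metric changes larger than `tolerance` (an assumed threshold)."""
    notes = []
    tpr_delta = candidate.true_positive_rate - baseline.true_positive_rate
    fpr_delta = candidate.false_positive_rate - baseline.false_positive_rate
    if abs(tpr_delta) > tolerance:
        direction = "improved" if tpr_delta > 0 else "degraded"
        notes.append(
            f"detection rate {direction} by {abs(tpr_delta) * 100:.1f} points")
    if abs(fpr_delta) > tolerance:
        # A lower false positive rate is the improvement here.
        direction = "improved" if fpr_delta < 0 else "degraded"
        notes.append(
            f"false alarm rate {direction} by {abs(fpr_delta) * 100:.1f} points")
    return notes


# Hypothetical hold-out labels and predictions from the old and new classifiers.
labels = [True, True, True, True, False, False, False, False]
old = evaluate([True, True, False, False, False, False, False, False], labels)
new = evaluate([True, True, True, False, True, False, False, False], labels)
notes = compare_to_baseline(old, new)
```

An empty list of notes would mean the re-trained classifier behaves like the baseline within tolerance, so no notification is needed.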

Depending on operating conditions, different targets for latency,
true positive rate, and false positive rate may be desired. As
system  administrators  are  unlikely  to  be  experts  in 
machine  learning,  a  key  question  we  will  need  to  answer  is 
how  to  best  present  the  accuracy  of  the  re-trained 
classifiers.  Additionally,  our  system  will  train  multiple 
classifiers  with  different  settings.  A  second  key  question  is 
what  data  should  be  provided  to  enable  the  user  to  select 
a  classifier. 
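
Both questions are open, but one possible starting point is sketched below: filter the trained classifiers by administrator-set limits, then phrase the accuracy figures as expected counts rather than raw rates. The field names, limits, and the 10,000-request scale are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    true_positive_rate: float
    false_positive_rate: float
    latency_ms: float


def shortlist(candidates, max_false_positive_rate, max_latency_ms):
    """Drop candidates that violate the limits; rank the rest by detection rate."""
    eligible = [
        c for c in candidates
        if c.false_positive_rate <= max_false_positive_rate
        and c.latency_ms <= max_latency_ms
    ]
    return sorted(eligible, key=lambda c: c.true_positive_rate, reverse=True)


def describe(candidate, events_per_day=10_000):
    """Summarize a candidate as expected counts instead of raw rates."""
    false_alarms = round(candidate.false_positive_rate * events_per_day)
    return (
        f"{candidate.name}: catches about {candidate.true_positive_rate:.0%} "
        f"of attacks, about {false_alarms} false alarms per {events_per_day} "
        f"benign requests, {candidate.latency_ms:.0f} ms per decision"
    )


# Hypothetical classifiers trained with different settings.
candidates = [
    Candidate("classifier-A", 0.95, 0.05, 40),
    Candidate("classifier-B", 0.90, 0.01, 15),
    Candidate("classifier-C", 0.80, 0.005, 5),
]
choices = shortlist(candidates, max_false_positive_rate=0.02, max_latency_ms=20)
summaries = [describe(c) for c in choices]
```

Whether counts, rates, or a visual summary serve administrators best remains a question for user studies; the sketch only fixes one concrete option.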

New  computers  will  be  added  to  the  system  and  will  need 
to be assigned to a cluster. Initially, there will not be any
behavioral data for a new computer, so the
administrator  will  have  to  assign  it  to  a  cluster  manually. 
Thus  clusters  must  have  user-friendly  descriptions  that 
permit  manual  cluster  assignments.  Once  the  new 
computer  has  been  active  long  enough,  it  can  be 
automatically  re-clustered.  These  and  other  changes  in 
clusters  should  be  approved  by  the  administrator. 
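
The hand-off from manual assignment to approved automatic re-clustering could look roughly like the following sketch. The observation threshold, the nearest-centroid rule, and the `approve` callback (standing in for the administrator's decision in a user interface) are all assumptions.

```python
from dataclasses import dataclass, field
from math import dist

MIN_OBSERVATIONS = 100  # assumed meaning of "active long enough"


@dataclass
class Host:
    name: str
    cluster: str  # initially chosen by the administrator
    observations: list = field(default_factory=list)  # behavioral feature vectors


def propose_recluster(host, centroids, min_observations=MIN_OBSERVATIONS):
    """Return a proposed new cluster name, or None if no change is suggested."""
    if not host.observations or len(host.observations) < min_observations:
        return None  # not enough behavioral data yet
    count = len(host.observations)
    profile = tuple(sum(col) / count for col in zip(*host.observations))
    closest = min(centroids, key=lambda name: dist(profile, centroids[name]))
    return closest if closest != host.cluster else None


def apply_if_approved(host, proposal, approve):
    """Change the assignment only when the administrator approves it."""
    if proposal is not None and approve(host, proposal):
        host.cluster = proposal
    return host.cluster


# Hypothetical clusters keyed by their user-friendly descriptions.
centroids = {"web servers": (100.0, 5.0), "desktops": (5.0, 50.0)}
laptop = Host("new-laptop", cluster="web servers")  # manual initial assignment
laptop.observations.extend([(6.0, 48.0), (4.0, 52.0), (5.0, 50.0)])
proposal = propose_recluster(laptop, centroids, min_observations=3)
```

Keeping the proposal separate from its application means the system never moves a computer between clusters without an explicit decision.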

Alert  information  must  be  displayed  to  the  administrator 
together  with  some  notion  of  the  accuracy  and  severity  of 
the  alert.  In  some  cases  the  system  may  be  able  to 
immediately  curtail  the  user's  action  -  e.g.,  block  an 
HTTP  request  -  while  other  cases  might  require  human 
review.  The  appropriate  course  of  action  will  inevitably 
depend  on  the  operating  context  of  the  system  (as 
controlled  by  the  administrator). 

Finally, the administrator should be able to see information
about the system state. Our cloud-based architecture will
allow  additional  resources  to  be  used  for  both  training  and 
classification.  The  administrator  should  be  able  to  control 
these  settings  to  adjust  the  system  performance. 

To make the BBAC system usable, it is vital to present key
data to the administrator. The administrator must
be  able  to  assign  new  computers  to  clusters,  select 
between  classifiers,  approve  changes  to  clusters,  and  be 
alerted  to  suspicious  behavior. 